Versions:

  • 0.3.5
  • 0.3.4
  • 0.3.3
  • 0.3.2
  • 0.3.1
  • 0.1.0

Grid 0.3.5 is the sixth public release of The Inference Grid’s command-line client, a lightweight Windows utility that lets machine-learning engineers, data scientists, and DevOps teams attach local or cloud GPUs to the Grid’s worldwide inference clearinghouse without leaving the terminal. Listed in the AI & Machine Learning category, the tool authenticates with a single token, discovers optimal compute endpoints, and streams model weights, tokenizers, and runtime images on demand, turning spare accelerator hours into a shared, low-latency inference fabric.

Typical use cases include offloading large language-model requests from overloaded on-prem servers, spinning up spot-instance GPU workers that auto-register with the Grid’s matchmaking engine, benchmarking throughput across geographic regions, and chaining autoscaling containers into serverless pipelines for computer-vision or NLP microservices.

Version 0.3.5 adds incremental model caching, improved CUDA 12 compatibility, quieter progress bars for CI logs, and a dry-run flag that previews resource allocation before committing hardware. Earlier releases introduced secure enclave attestation (0.2.x), multi-arch container manifests (0.3.0), and dynamic batch-size negotiation (0.3.3), giving users a concise upgrade path that preserves existing job manifests and credential stores.

Because the executable is self-contained, it can be scripted inside PowerShell, Ansible, or GitHub Actions workflows to provision ephemeral inference nodes that unregister automatically when jobs complete. The software is available for free on get.nero.com, with downloads provided via trusted Windows package sources such as winget.
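As a rough illustration of the CI scripting pattern described above, a GitHub Actions workflow might install the client, authenticate, preview allocation with the dry-run flag, and then register an ephemeral worker. Note that the `grid` executable name, its subcommands and flags, and the winget package identifier below are all assumptions for the sketch; the client's actual CLI surface is not documented on this page.

```yaml
# Hypothetical GitHub Actions job: provision an ephemeral Grid inference
# node for the duration of a workflow run. Every `grid` subcommand, flag,
# and the winget package ID are illustrative assumptions, not documented
# CLI surface.
name: ephemeral-inference-node
on: workflow_dispatch

jobs:
  inference:
    runs-on: windows-latest
    steps:
      - name: Install client via winget (package ID assumed)
        run: winget install TheInferenceGrid.Grid --silent

      - name: Authenticate with a single token
        run: grid auth --token "$env:GRID_TOKEN"
        env:
          GRID_TOKEN: ${{ secrets.GRID_TOKEN }}

      - name: Preview resource allocation before committing hardware
        run: grid worker register --gpu auto --dry-run

      - name: Register worker (assumed to unregister when the job ends)
        run: grid worker register --gpu auto
```

The token is pulled from an encrypted repository secret rather than hard-coded, which is the standard pattern for credential handling in GitHub Actions regardless of which CLI is being driven.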

Tags: